Spatiotemporal Gene Networks from ISH Images
Abstract
As large-scale techniques for measuring gene expression have been developed, automatically inferring gene interaction networks from expression data has emerged as a popular way to advance our understanding of cellular systems. Accurate prediction of gene interactions, especially in multicellular organisms such as Drosophila or humans, requires temporal and spatial analysis of gene expression, which is not easily obtainable from microarray data. New image-based techniques using in situ hybridization (ISH) have recently been developed to allow large-scale spatial-temporal profiling of whole-body mRNA expression. However, analyzing such data to discover new gene interactions remains an open challenge.
This thesis studies the question of predicting gene interaction networks from ISH data. First, we present SpEX2, a computer vision pipeline to extract informative features from ISH data. Next, we present an algorithm, Gin-IM, for learning spatial gene interaction networks from embryonic ISH images at a single time step. Gin-IM combines multi-instance kernels with recent work on learning sparse undirected graphical models to predict interactions between genes. Next, we propose NP-MuScL (nonparanormal multi source learning) to estimate a gene interaction network that is consistent with multiple sources of data that share the same underlying relationships between the nodes. NP-MuScL uses the semiparametric Gaussian copula to model the distribution of each data source, with the different copulas sharing the same covariance matrix. Finally, we propose to extend NP-MuScL to also learn the importance, or contribution, of each data source in predicting the gene interaction network, using a few examples of known gene interactions. Weighted NP-MuScL uses the hinge loss to learn the contribution of each data source, with a KL-divergence penalty that encourages the weights to stay close to a prior for each data source. We also discuss reasonable priors for such a model.
We propose to apply our algorithms to more than 100,000 Drosophila embryonic ISH images from the Berkeley Drosophila Genome Project. Each of the 6 time steps in Drosophila embryonic development is treated as a separate data source. With spatial gene interactions predicted via Gin-IM, and temporal predictions combined via weighted NP-MuScL, we will finally be able to predict spatiotemporal gene networks from these images.
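The idea of several data sources sharing one covariance structure under a Gaussian copula can be illustrated with a minimal sketch. This is not the thesis's NP-MuScL implementation: the function names (`gaussianize`, `correlation`, `pooled_correlation`), the toy data, and the choice to weight each source by its sample size are all assumptions made for illustration. The sketch only covers the rank-based Gaussianization commonly used with nonparanormal models and the pooling of per-source correlations. A sparse inverse-covariance estimator (for example, a graphical lasso) would then be applied to the pooled matrix to obtain network edges.

```python
from statistics import NormalDist


def gaussianize(column):
    """Map one variable to normal scores using only its ranks.

    Any strictly increasing transform of the input gives the same output,
    which is the property a Gaussian copula model relies on.
    Ties are broken by input order (an assumption of this sketch).
    """
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # rank / (n + 1) keeps the quantile strictly inside (0, 1)
        scores[idx] = std_normal.inv_cdf(rank / (n + 1))
    return scores


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def pooled_correlation(sources):
    """Combine per-source correlations of Gaussianized data.

    `sources` is a list of datasets; each dataset is a list of columns,
    one column per gene, in the same gene order across datasets.
    Weights are proportional to sample size (hypothetical choice).
    """
    total = sum(len(src[0]) for src in sources)
    p = len(sources[0])
    pooled = [[0.0] * p for _ in range(p)]
    for src in sources:
        z = [gaussianize(col) for col in src]
        weight = len(src[0]) / total
        for i in range(p):
            for j in range(p):
                pooled[i][j] += weight * correlation(z[i], z[j])
    return pooled


# Toy data: two "time steps", two genes, with differently scaled
# but monotonically related measurements in each source.
source_a = [[0.1, 0.5, 0.9, 1.3], [1.0, 3.0, 20.0, 400.0]]
source_b = [[2.0, 1.0, 3.0], [8.0, 1.0, 27.0]]
shared = pooled_correlation([source_a, source_b])
```

Because only ranks enter the computation, the two sources can be measured on very different scales and still contribute to the same pooled matrix. Learning the source weights from known interactions, as weighted NP-MuScL proposes, is outside the scope of this sketch.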
Similar resources
GINI: From ISH Images to Gene Interaction Networks
Accurate inference of molecular and functional interactions among genes, especially in multicellular organisms such as Drosophila, often requires statistical analysis of correlations not only between the magnitudes of gene expressions, but also between their temporal-spatial patterns. The ISH (in-situ-hybridization)-based gene expression micro-imaging technology offers an effective approach to ...
Drosophila Gene Expression Pattern Annotations via Multi-Instance Biological Relevance Learning
Recent developments in biology have produced a large number of gene expression patterns, many of which have been annotated textually with anatomical and developmental terms. These terms spatially correspond to local regions of the images, which are attached collectively to groups of images. Because one does not know which term is assigned to which region of which image in the group, the develop...
FuncISH: learning a functional representation of neural ISH images
MOTIVATION: High-spatial-resolution imaging datasets of mammalian brains have recently become available in unprecedented amounts. Images now reveal highly complex patterns of gene expression varying on multiple scales. The challenge in analyzing these images is both in extracting the patterns that are most relevant functionally and in providing a meaningful representation that allows neuroscient...
SPEX2: automated concise extraction of spatial gene expression patterns from Fly embryo ISH images
MOTIVATION: Microarray profiling of mRNA abundance is often ill suited for temporal-spatial analysis of gene expressions in multicellular organisms such as Drosophila. Recent progress in image-based genome-scale profiling of whole-body mRNA patterns via in situ hybridization (ISH) calls for development of accurate and automatic image analysis systems to facilitate efficient mining of complex tem...
Inferring Gene Interaction Networks from ISH Images via Kernelized Graphical Models
New biotechnologies are being developed that allow high-throughput imaging of gene expression, where each image captures the spatial expression pattern of a single gene in the tissue of interest. This paper addresses the problem of automatically inferring a gene interaction network from such images. We propose a novel kernel-based graphical model learning algorithm that is both convex an...